Abstract


Interdependence  of  tasks  in  a  mission  necessitates  information  flow  among  the  organizational  elements 
(agents)  assigned  to  these  tasks.  This  information  flow  introduces  communication  delays.  An  effective 
task  schedule  that  minimizes  the  total  execution  time,  including  task  processing  and  coordination  delays, 
is  an  important  issue  in  designing  an  organization  and  its  task  processing  strategy.  This  paper  defines  the 
structure  of  information-dependent  tasks,  and  describes  an  approach  to  map  this  structure  to  a  network 
of  organizational  elements  (agents). 

Since  the  general  problem  of  scheduling  tasks  with  communication  is  NP-hard,  only  fast  heuristic 
(list  scheduling  and  linear  clustering)  algorithms  are  discussed.  We  modify  the  priority  calculation  for 
list  scheduling  methods,  matching  the  critical  path  with  a  network  of  heterogeneous  agents.  We  present 
our algorithm, termed Heterogeneous Dynamic Bottom Level (HDBL), and compare it with various
list-scheduling heuristics. The results show that HDBL exhibits superior performance to all list scheduling
algorithms,  providing  an  improvement  of  over  25%  in  schedule  length  for  communication-intensive  task 
graphs. 


1  Introduction 

Contingency  theorists  argue,  and  the  empirical  research  confirms,  that  a  proper  organizational  design  is 
critical  to  superior  organizational  performance  ([Burton98],  [Entin99],  [Hocevar99]).  Setting  up 
efficient  organizational  processes  is  one  of  the  keys  to  a  successful  organizational  design.  Formalizing 
the  team  processes  provides  a  basis  for  identifying  the  design  parameters  that  can  be  optimized  to 
improve  team  performance. 

From  a  systems  theory  viewpoint,  an  organization  is  an  open  system.  It  can  be  modeled  by 
specifying  several  key  entities  including:  (i)  the  environment;  (ii)  the  organizational  elements 
(sometimes  termed  agents  or  processors);  (iii)  the  organizational  structure;  (iv)  the  organizational 
processes;  and  (v)  the  organizational  outcomes.  While  mission  decomposition  into  tasks  provides  a  basis 
for  balancing  the  effort  among  agents,  the  input-output  transformations  that  link  tasks  define  the  “flows” 
within  the  organization  and/or  between  the  organization  and  its  environment.  Two  important  examples 
of  such  flows  include  the  information  flow  (specifying  communication  among  the  agents)  and 
commodity  flow  (e.g.,  the  production  cycle  that  transforms  the  raw  materials  into  ready-to-sell 
products).  The  corresponding  flows  (from  hereon  termed  process  flows)  characterize  the  organizational 
processes  by  specifying  input-output  relationships  among  the  organizational  elements. 

The  problem  of  optimizing  team  processes  can  be  decomposed  into  three  parts:  (1)  optimizing  the 
functional  allocation  strategy  to  achieve  desired  goal  states;  (2)  decomposing  functions  into  sets  of 
interdependent  tasks;  and  (3)  mapping  the  tasks  and  their  process  flows  onto  an  organization  to  optimize 
the  processing  cost.  The  solution  to  the  first  of  these  three  problems  from  two  different  perspectives, 
dynamic  Bayesian  networks  and  Markov  Decision  Processes,  is  presented  in  [Meirina2002]  and 
[Tu2002].  An  important  feature  of  the  process  flows  is  that  the  flow  medium  can  change  its  content  and 
volume  after  passing  through  any  of  the  processing  nodes  (e.g.,  the  information  can  be  filtered  or  fused). 

A  coherent,  timely,  and  efficient  team  process  greatly  improves  the  chances  for  successful  team 
performance.  Therefore,  mapping  process  flows  onto  an  organization  is  an  important  issue  affecting 
team performance [Levchuk2002], since it specifies two sets of variables: (i) an allocation of tasks to
agents (or system elements), and (ii) requirements for the flow of information among agents. The task
processing schedule and flow routing can be optimized to minimize schedule inefficiencies (i.e., delays),
utilized  resources,  expended  energy,  coordination  overhead,  and  so  on. 

The  problem  of  scheduling  tasks  on  a  processor  architecture  is  of  considerable  importance  in  parallel 
processing. The general problem of scheduling a task graph to minimize the total parallel time is
NP-hard (no polynomial-time algorithm is known for finding an optimal solution). Even if communication
among task nodes is zero, we obtain a simplified scheduling problem that is still NP-hard. For this problem, it
was  shown  that  any  list  scheduling  heuristic  is  within  50%  of  the  optimal  solution  (see  [Graham66]).  It 
was  later  empirically  demonstrated  ([Adam74])  that  the  critical  path  list  scheduling  method  has  even 
better  performance:  its  solution  is  within  5%  of  the  optimum  in  90%  of  cases. 

In the presence of inter-task communication, however, the problem becomes much harder. List
scheduling no longer has the 50% performance guarantee. The problem of scheduling tasks with
communication has received much attention ([Wu88], [Kim88], [Sarkar89], [McCreary90],
[El-Rewini90a&b], [Gerasoulis90a&b], [Sih93]), since it can be used for scheduling tasks on
message-passing architectures. The one-stage approach of the list scheduling method (assigning tasks
directly to physical agents) was contrasted with two-stage methods, in which the first stage performs
reduction clustering and preprocessing that explore the topology of the communication graph regardless
of agent constraints. However, the problem becomes significantly more complex when various constraints
are introduced, and the two-stage methods can no longer be applied.

This  paper  is  organized  as  follows.  The  problem  of  scheduling  information  tasks  onto  an  agent 
structure  is  formulated  in  section  2.  Section  3  outlines  scheduling  and  mapping  variables  used  for  our 
problem.  The  mapping  and  scheduling  feasibility  conditions  are  discussed  in  section  4.  The  solution 
approach, together with related research, is presented in sections 5-6. The list scheduling method is
discussed in section 6, with previous work outlined in subsection 6.1 and our algorithm presented in
subsection  6.2.  Section  7  presents  simulation  results.  Conclusions  and  future  extensions  are  given  in 
section  8.  The  issue  of  mapping  a  critical  path  onto  a  heterogeneous  system  is  presented  in  Appendix  A, 
and  an  efficient  algorithm  for  scheduling  multiple  information  messages  with  release  times  onto  agents’ 
communication  link  is  outlined  in  Appendix  B. 

This  paper  provides  the  following  contributions  to  the  methodology  of  scheduling  task 
communication  graphs  onto  a  heterogeneous  system  of  agents: 

• The agent network structure is utilized to find the dynamic critical path, the earliest possible start
time, and the latest possible finish time of a task on an agent (Appendix A);

• The dynamic critical path mapped to the agent network is utilized to compute agent-task priorities in
the HDBL algorithm, which outperforms conventional list-scheduling methods; and

• An efficient algorithm is developed for scheduling multiple information messages with release times
onto an agent’s communication link (Appendix B).


2  Problem  Statement 

The  goal  of  an  organization  is  to  complete  assigned  missions  in  the  most  efficient  manner.  Each  mission 
can  be  decomposed  into  a  set  of  tasks  with  specific  constraints  [Levchuk2000a],  [Kapasouris91].  One  of 
the  constraints  of  the  mission  is  a  particular  ordering  in  which  the  tasks  must  be  completed  (a  mission 
plan).  An  organization  consists  of  agents  (processors,  human  decision-makers,  etc.).  Each  agent  has 
certain resources available to it, which, together with training (for human organizations), determine the
expertise to carry out assigned tasks/processes. Generally, agents have different capabilities to perform
tasks.  Processing  elements  with  different  capabilities  are  termed  heterogeneous. 

This  paper  addresses  the  problem  of  finding  the  optimal  allocation  (scheduling)  or  mapping  of  tasks 
with  communication  requirements  (called  information  tasks)  to  heterogeneous  agents  (organizational 
elements)  while  satisfying  various  constraints.  The  objective  function  is  to  minimize  the  completion 
time  of  the  mission,  that  is,  the  finish  time  of  the  terminal  task  (also  termed  the  makespan). 

TABLE I.

Task Attributes

$G_t = (V_t, E_t)$                               directed acyclic graph of tasks with precedence constraints
$T_i \in V_t$                                    task node
$e^t_{ij} = \langle T_i, T_j \rangle \in E_t$    a precedence arc in the task graph
$f_{ij}$                                         amount of information transmitted between tasks $T_i$ and $T_j$ along the arc $e^t_{ij}$
$w_i$                                            task processing load (workload)
$m_i$                                            task memory load


Information tasks are modeled via a directed acyclic information graph $G_t = (V_t, E_t)$, where
$V_t = \{T_i,\ i = 1, \dots, N\}$ is the set of task nodes, $N = |V_t|$ is the number of nodes,
$E_t = \{e^t_{ij} = \langle T_i, T_j \rangle\}$ is the set of directed edges, and $e_t = |E_t|$ is the number of
edges. Edges in the graph correspond to communication messages and precedence constraints among tasks.
The amount of information (weight of communication) transmitted from task $T_i$ to $T_j$ (incurred along
the edge $e^t_{ij} = \langle T_i, T_j \rangle$) is denoted by $f_{ij}$, which becomes zero if both tasks are
allocated to the same agent. For each task $T_i$, a processing load (or workload) $w_i$ and a memory load
$m_i$ are defined. Task attributes are outlined in Table I. Fig. 1 shows an example of a task graph, with
weights on the arcs representing the information flow transferred between the corresponding tasks, and
weights on the task nodes indicating task workload and memory load.

The agents are modeled via another dependency graph. The agent structure is defined by an
undirected graph $G_a = (V_a, E_a)$, where $V_a = \{A_r,\ r = 1, \dots, K\}$ is the set of agent nodes,
$K = |V_a|$ is the number of nodes, $E_a = \{e^a_{ru} = \langle A_r, A_u \rangle\}$ is the set of undirected
communication links among agents with transfer rates $c_{r,u}$, and $e_a = |E_a|$ is the number of links.
For each agent $A_r$, a processing (or workload) capacity $W_r$ and a memory capacity $M_r$ are defined,
and the time to process task $T_i$ is $p_{r,i}$ ($p_{r,i} = \infty$ if the agent cannot process this task;
we assume that $p_{r,i} = \infty$ if $W_r < w_i$).

Agent  attributes  are  outlined  in  Table  II.  Fig.  2  shows  an  example  of  an  agent  network,  with  weights 
on  the  arcs  representing  the  rate  of  information  transfer  between  the  corresponding  agents,  and  weights 
on  the  agent  nodes  indicating  agent  workload  and  memory  capacity.  Table  III  shows  the  agent-task 
processing  time  matrix. 


The  execution  model  works  as  follows  (for  a  similar  macro-dataflow  model,  see  [Sarkar89], 
[Wu88]).  The  data  flow  triggers  the  execution  of  tasks.  A  task  receives  all  data  from  its  predecessors  in 
parallel.  It  then  executes  without  interruption  (non-preemptively)  and  immediately  after  completion  it 
sends  the  data  to  all  successors  in  parallel.  In  this  model,  task  execution  and  agent  communication  are 
done  in  parallel  subject  to  constraints  on  workload  and  memory  capacities,  and  communication 
contention. 


TABLE II.

Agent Attributes

$G_a = (V_a, E_a)$                               undirected graph of agents with communication links
$A_r \in V_a$                                    agent node
$e^a_{ru} = \langle A_r, A_u \rangle \in E_a$    a communication link
$c_{r,u}$                                        rate of information transfer between agents $A_r$ and $A_u$ along the link $e^a_{ru}$
$p_{r,i}$                                        time required to process task $T_i$
$W_r$                                            agent processing (workload) capacity
$M_r$                                            agent memory capacity


TABLE III.

Agent-Task Processing Time

            Task T1   Task T2   Task T3   Task T4   Task T5   Task T6
Agent A1    3         1         2         1         ∞         3
Agent A2    ∞         2         ∞         1         ∞         1
Agent A3    1.5       1         ∞         ∞         ∞         0.5
Agent A4    ∞         ∞         ∞         3         0.5       ∞
Agent A5    1         ∞         1         ∞         2         2
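
As a small illustration, the processing-time matrix of Table III can be held in a lookup structure, with infeasible agent-task pairs marked as infinity. The sketch below is only an assumed encoding: the names `P`, `capable_agents`, and `fastest_agent` are hypothetical and do not come from the paper. It lists the agents able to process a task and picks the one with the smallest processing time, ignoring communication and capacity constraints.

```python
import math

INF = math.inf

# Processing times from Table III: rows are agents A1..A5, columns are tasks T1..T6.
# The dictionary layout is an assumption made for this sketch.
P = {
    1: [3, 1, 2, 1, INF, 3],
    2: [INF, 2, INF, 1, INF, 1],
    3: [1.5, 1, INF, INF, INF, 0.5],
    4: [INF, INF, INF, 3, 0.5, INF],
    5: [1, INF, 1, INF, 2, 2],
}


def capable_agents(task):
    """Return the agents r whose processing time for the 1-based task index is finite."""
    return [r for r, row in P.items() if row[task - 1] != INF]


def fastest_agent(task):
    """Return the capable agent with the smallest processing time for this task."""
    return min(capable_agents(task), key=lambda r: P[r][task - 1])
```

Under this encoding, task T5 can only be processed by agents A4 and A5, which already restricts the mapping before any scheduling decision is made.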

The  processes  of  an  organization  assigned  to  execute  a  mission  consisting  of  information  tasks  can 
be  conceptualized  as  follows: 

□  Task  execution  (processing)  by  organizational  agents. 

□  Agent  communication  -  routing  task  information  flow  among  agents. 

□  Storing  of  tasks  in  the  agent’s  memory. 


2.1  Task  Execution 

Every task is allocated to a single agent capable of processing this task. When a task $T_i$ is processed by
an agent $A_r$, the latter’s workload is increased by $w_i$ units. Agents can generally process more than one
task at a time, but the dynamic workload (total load of simultaneously processed tasks) of any agent $A_r$
must not exceed the agent’s workload capacity $W_r$. A task can begin to be processed by an agent when all
the predecessors of the task have been completed and all the information flow from them has been
communicated to this agent. An example of task processing under an agent’s workload capacity
constraint is shown in Fig. 3.
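
The workload capacity rule can be sketched as a simple check on one agent. This is a minimal illustration, not the paper's implementation: the function names are hypothetical, and the sketch assumes that a task occupies the half-open interval from its start time up to, but not including, its finish time.

```python
def workload_at(t, running):
    """Total load at time t of tasks given as (start, finish, load) tuples.

    Assumption of this sketch: a task is active for start <= t < finish.
    """
    return sum(w for start, finish, w in running if start <= t < finish)


def can_start(start, duration, load, running, capacity):
    """Check that a new task fits under the agent's workload capacity.

    The dynamic workload only increases at task start times, so it is
    enough to test the new start time and every start inside the interval.
    """
    finish = start + duration
    points = {start} | {s for s, _, _ in running if start < s < finish}
    return all(workload_at(t, running) + load <= capacity for t in points)


# Hypothetical agent already processing two tasks, with capacity 4.
running_tasks = [(0, 2, 2), (1, 3, 1)]
```

With these example values, a task of load 2 cannot start at time 1 (the dynamic workload would reach 5), but it can start at time 2, when the first task has finished.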


Figure  3:  Task  Processing  Figure  4:  Agent  Communication 


2.2  Agent  Communication 


If tasks $T_i$ and $T_j$ are assigned to different agents, information $f_{ij}$ must be communicated between
these agents in the organization (communication is zero if these tasks are assigned to the same agent).
The agents can communicate only one message at a time. The time required to communicate $f_{ij}$ units
of information from agent $A_r$ to $A_u$ along the link $e^a_{ru}$ is equal to $f_{ij} / c_{r,u}$ if
$c_{r,u} \ne 0$. We could generalize the problem formulation by making this time dependent on tasks and on
the link between communicating agents.


We assume that only connected agents communicate, and if $c_{r,u} = 0$, then communication between
these agents cannot happen. Another approach is to allow such communication to occur through the
shortest path between these agents in the network, assuming that the agent network is fully connected. In
this case, the most efficient routing of information should be performed dynamically to account for
communication link contention. An example of agent communication due to task information flow is
shown in Fig. 4.


2.3  Task  Storing 

The storing of task $T_i$ (in the agent’s memory) is required if:

a) Task $T_i$ and its successor task $T_j$ (the task that requires information from $T_i$) are assigned to the
same agent $A_r$; in this case, the dynamic memory load of agent $A_r$ is increased by $m_i$ units from the
finish time of $T_i$ until the start time of $T_j$;

b) Task $T_i$ is assigned to agent $A_r$, but its successor task $T_j$ is assigned to agent $A_u$ ($u \ne r$); in this
case, the dynamic memory load of agent $A_r$ is increased by $m_i$ units from the finish time of $T_i$ until
the time communication of information $f_{ij}$ is initiated from agent $A_r$ to $A_u$, and the dynamic memory
load of agent $A_u$ is increased by $m_i$ units from the time information $f_{ij}$ is received from agent $A_r$
until the start time of task $T_j$.


The dynamic memory load of any agent $A_r$ must not exceed its memory threshold $M_r$. An example of
agent communication and task storage is shown in Fig. 5.
Figure 5: Task Storing


TABLE IV.

Mapping and Scheduling Variables

$x_{r,i}$      assignment variables ($x_{r,i} = 1$ iff task $T_i$ is assigned to agent $A_r$)
$ST(i)$        start time of task $T_i$
$FT(i)$        finish time of task $T_i$
$SF(i,j)$      start time of transfer of information between tasks $T_i$ and $T_j$
$FF(i,j)$      finish time of transfer of information between tasks $T_i$ and $T_j$
$W(r,t)$       workload of agent $A_r$ at time $t$
$M(r,t)$       memory load of agent $A_r$ at time $t$
$E(r,u)$       set of messages passed from agent $A_r$ to agent $A_u$


The  objective  is  to  find  a  mapping  of  task  structure  onto  agents’  network  and  the  corresponding  task 
schedule  that  minimize  the  mission  completion  time  ( makespan )  -  the  completion  time  of  the  last  task. 
This  problem  can  be  viewed  as  consisting  of  three  parts: 

1.  Allocation  of  tasks  to  agents. 

2.  Sequencing  of  task  execution  for  each  agent. 

3.  Sequencing  of  communication  (due  to  task  information  flow)  in  agents’  network. 


3  Mapping  Variables 

In  Table  IV,  we  list  the  variables  used  to  define  the  solution  to  the  mapping  and  scheduling  problem. 

We  define  other  relevant  variables  to  facilitate  the  explanations: 

• $a(i)$ is the agent to which task $T_i$ is assigned ($x_{a(i),i} = 1$);

• $ST(j,u)$ = (earliest) start time of task $T_j$ on agent $A_u$;

• $FT(j,u)$ = (earliest) finish time of task $T_j$ on agent $A_u$;

• $SF(i,j,r,u)$ = start time for transfer of information from task $T_i$ to $T_j$ on the link between agents
$A_r$ and $A_u$;

• $FF(i,j,r,u)$ = finish time for transfer of information from task $T_i$ to $T_j$ on the link between agents
$A_r$ and $A_u$;

• $OUT(i) = \{ j : \exists\, e^t_{ij} \in E_t \}$ - the set of immediate successors of task $T_i$;

• $IN(j) = \{ i : \exists\, e^t_{ij} \in E_t \}$ - the set of immediate predecessors of task $T_j$;

• $l_b(i,k)$ = length of the longest assignment among all minimum schedules of paths leading from task
$T_i$ assigned to agent $A_k$ to the end of the graph; $l_b(i)$ is the smallest among them (sometimes called
the bottom level of a node): $l_b(i) = \min_m l_b(i,m)$;

• $CP(i,k)$ = a critical path starting with task $T_i$ assigned to agent $A_k$ (a sequence of tasks that
corresponds to $l_b(i,k)$); $CP(i)$ is the shortest path among $CP(i,k)$:

$$CP(i) = CP(i, m^*), \quad m^* = \arg\min_m l_b(i,m) \qquad (1)$$

• $CA(i,k)$ = the sequence of agents on which the critical path $CP(i,k)$ is scheduled; $CA(i)$ is the shortest
such schedule ($CA(i) = CA(i, m^*)$, $m^* = \arg\min_m l_b(i,m)$);

• $l_t(j,m)$ = length of the longest assignment among all minimum schedules of paths leading from the
start of the task graph to task $T_j$ assigned to agent $A_m$, not counting the execution time of task $T_j$;
$l_t(j)$ is the smallest among them (sometimes called the top level of a node): $l_t(j) = \min_m l_t(j,m)$.


Note that $FT(j,u) = ST(j,u) + p_{u,j}$, and

$$
FF(i,j,r,u) =
\begin{cases}
SF(i,j,r,u) + \dfrac{f_{ij}}{c_{r,u}}, & \text{if } r \ne u,\ c_{r,u} \ne 0 \\
\infty, & \text{if } r \ne u,\ c_{r,u} = 0 \\
SF(i,j,r,u) = FT(i,r), & \text{otherwise}
\end{cases}
\qquad (2)
$$

Static calculation and dynamic update of $l_b(i,k)$ and $l_t(j,m)$ are presented in Appendix A.
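
The piecewise finish time of an information transfer in equation (2) can be sketched as a small function. This is an illustrative reading under stated assumptions, not the authors' code: the function name and argument order are hypothetical, and a zero link rate is taken to mean that the two agents are not connected.

```python
import math


def transfer_finish(sf, ft_i, f_ij, r, u, c_ru):
    """Finish time FF(i, j, r, u) of sending f_ij from agent r to agent u.

    sf   : start time of the transfer, SF(i, j, r, u)
    ft_i : finish time of the sending task on agent r, FT(i, r)
    c_ru : transfer rate of the link between r and u (0 means no link)
    """
    if r == u:
        # Same agent: no transfer is needed, the data is available at FT(i, r).
        return ft_i
    if c_ru == 0:
        # Different agents without a direct link: the transfer is infeasible.
        return math.inf
    return sf + f_ij / c_ru
```

For example, 6 units of information sent at time 2 over a link of rate 3 arrive at time 4, while the same message between two tasks on one agent costs nothing.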


4  Mapping  Feasibility 

To  find  whether  the  problem  of  mapping  a  task  graph  onto  a  network  of  agents  is  feasible,  we  need  to 
test  the  following  conditions: 

a) agent-to-task assignment feasibility - for each task there exists an agent that can execute this task:

$$\forall\, T_i \ \exists\, A_r :\ p_{r,i} \ne \infty \text{ and } W_r \ge w_i \qquad (3)$$

b) agent’s workload capacity constraints - at any time $t$ during mission processing the total workload of
an agent must not exceed the agent’s workload capacity:

$$W_r \ge W(r,t) = \sum_{i:\ x_{r,i} = 1,\ ST(i) < t,\ FT(i) > t} w_i \qquad (4)$$

c) information transfer feasibility - each information message can be communicated in the agents’ network
(only directly connected agents can communicate and the transfer of information between agents
according to task information flow occurs without interruption):

$$\forall\, e^t_{ij} = (T_i, T_j) \in E_t,\ f_{ij} \ne 0 \quad \exists\, A_r, A_u :\ p_{r,i} \ne \infty,\ p_{u,j} \ne \infty, \text{ and } c_{r,u} \ne 0 \qquad (5)$$

d) information scheduling feasibility - an information link can transmit only one message at a time:

$$\text{for } E(r,u) = \{ f_{i_1,j_1}, \dots, f_{i_m,j_m} \},\ \forall k = 1, \dots, m-1 \ \Rightarrow\ FF(i_k, j_k) \le SF(i_{k+1}, j_{k+1}) \qquad (6)$$

e) memory allocation feasibility - at any time $t$ during mission processing, the memory load of an agent
(subsection 2.3) does not exceed the agent’s memory capacity:

$$M_r \ge M(r,t) = \sum_{e^t_{ij} \in E_t:\ x_{r,i} = 1,\ FT(i) < t,\ SF(i,j) > t} m_i \ + \sum_{e^t_{ij} \in E_t:\ x_{r,j} = 1,\ FF(i,j) < t,\ ST(j) > t} m_i \qquad (7)$$

f) precedence constraints - a task can start execution only after all of its predecessors are finished and all
the information is communicated to the corresponding agent:

$\forall\, e^t_{ij} \in E_t$, $x_{r,i} = 1$, $x_{u,j} = 1$ we have:

$$FT(i) = ST(i) + p_{r,i} \le SF(i,j) \le FF(i,j) \qquad (8)$$

$$
FF(i,j) =
\begin{cases}
SF(i,j) + \dfrac{f_{ij}}{c_{r,u}} \le ST(j), & \text{if } u \ne r \\
SF(i,j) = FT(i), & \text{otherwise}
\end{cases}
\qquad (9)
$$


To find the feasibility of scheduling a task graph on an agent architecture, we need to run the critical
path algorithm for a heterogeneous network (Appendix A). If there exists $i$ such that $l_b(i) = \infty$, then the
task graph scheduling is infeasible. The local dynamic feasibility of a schedule is maintained by always
allocating a task $T_i$ to an agent $A_r$ such that $l_b(i,r) \ne \infty$. Coefficients $l_b(i,r)$ are computed
off-line as in Appendix A.
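
Two of the feasibility conditions above are easy to state as direct checks. The sketch below is a hedged illustration with hypothetical function names and data layouts: `assignment_feasible` mirrors condition (3), and `link_schedule_feasible` mirrors condition (6) for the messages on a single link, assuming that one message may start exactly when the previous one finishes.

```python
import math


def assignment_feasible(p, w, W):
    """Condition (3): every task has an agent that can process it.

    p[r][i] : processing time of task i on agent r (math.inf if infeasible)
    w[i]    : workload of task i
    W[r]    : workload capacity of agent r
    """
    return all(
        any(p[r][i] != math.inf and W[r] >= w[i] for r in range(len(W)))
        for i in range(len(w))
    )


def link_schedule_feasible(messages):
    """Condition (6): (start, finish) transfer intervals on one link do not overlap."""
    ordered = sorted(messages)
    return all(prev[1] <= nxt[0] for prev, nxt in zip(ordered, ordered[1:]))
```

These checks only cover static assignment and a single link; the workload, memory, and precedence conditions depend on the full schedule and are not covered here.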


5  Solution  Approach 

The problem defined in section 2 is NP-hard even in very simple cases ([Garey79], [El-Rewini90b]).

As mentioned in section 2, the scheduling problem can be thought of as consisting of three parts: 1)
the task-to-agent assignment; 2) the task execution ordering within an agent; and 3) sequencing of
information transfer (information routing) in the agents’ network. A list scheduling heuristic applied to task
and information scheduling solves all three problems at once (one-stage method). While low-complexity
one-stage methods such as the Critical Path (CP) algorithm perform very well when communication
delays are zero, this is not the case with non-zero communication delays. This is because the edge
weights are no longer deterministic; they are functions of the task-to-agent assignment (communication
is zero when two tasks are assigned to the same agent, and non-zero when they are assigned to
different agents). Consequently, task priorities depend on the task-to-agent assignment, and cannot be
accurately estimated using a one-stage method. In two-stage methods, once the mapping of tasks to
agents is obtained, the communication pattern becomes deterministic, and a better priority ordering
among tasks can be derived.

In general, the existing algorithms for static scheduling of parallel programs represented by a
macro-dataflow graph onto a set of processors can be classified into three categories:

•  Bounded  number  of  processors  (BNP)  scheduling 

•  Unbounded  number  of  clusters  (UNC)  scheduling 

•  Arbitrary  processor  network  (APN)  scheduling 

The  first  class  of  algorithms  is  limited  to  the  fully  connected  processor/agent  structures;  they  do  not 
consider  link  contention.  The  second  class  of  algorithms  employs  hierarchical  clustering  of  tasks;  these 
algorithms  do  not  account  for  non-homogeneousness  of  the  processing  architecture.  The  last  class 
performs  scheduling  of  tasks  onto  processors  and  communication  onto  network  channels;  these 
algorithms  consider  link  contention. 

BNP algorithms can be modified to account for network topology and employed to solve our
problem. On the other hand, only linear clustering methods of UNC scheduling can be used for
non-homogeneous networks of agents.

The algorithms considered here can be decomposed into two groups: (i) list scheduling algorithms,
and (ii) clustering algorithms (with message routing). The list-scheduling algorithms assign tasks and the
corresponding communication one-by-one in a topological order obtained from the task graph.
Clustering algorithms assign sets of tasks onto clusters, dynamically changing the original
communication graph. List scheduling can be improved by utilizing the insertion heuristic [Gan96].

The  following  definitions  are  used  in  the  descriptions  of  the  algorithms: 

• $\tilde{p}_i$ = median task processing time (equal to the largest feasible processing time if $\exists\, k$ such that
$p_{k,i} = \infty$);

• $\bar{p}_i$ = mean task processing time (equal to the largest feasible processing time if $\exists\, k$ such that
$p_{k,i} = \infty$);

• $\bar{c}$ = mean communication link rate: $\bar{c} = \dfrac{\sum_{e^a_{ru} \in E_a} c_{r,u}}{|E_a|}$
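
The aggregate quantities defined above can be computed as follows. This is a minimal sketch with hypothetical function names; it follows the stated rule that the largest feasible processing time is used whenever some agent cannot process the task, and it assumes that at least one agent can process each task.

```python
import math
from statistics import mean, median


def representative_time(times, stat):
    """Median or mean processing time of one task over all agents.

    times : processing times of the task on every agent (math.inf if infeasible)
    stat  : statistics.median or statistics.mean
    If any agent is infeasible, the largest feasible time is returned instead.
    """
    finite = [t for t in times if t != math.inf]
    if len(finite) < len(times):
        return max(finite)
    return stat(times)


def mean_link_rate(rates):
    """Mean communication rate over the links of the agent network."""
    return sum(rates) / len(rates)
```

For instance, the Task T1 column of Table III (3, ∞, 1.5, ∞, 1) contains infeasible agents, so both the median and the mean are replaced by the largest feasible time, 3.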


6  List  Scheduling  Heuristics 

List  scheduling  (or  priority  scheduling)  algorithms  define  priorities  of  tasks  (either  static  or  dynamic), 
and  allocate  “ready”  tasks  (that  have  all  predecessors  assigned  to  agents)  in  the  decreasing  order  of 
priority $pr(i)$, accounting for agent idle times and network topology. The information message routing
performance  is  included  in  computing  a  task’s  earliest  start  time  for  each  agent. 

Earlier  research  was  concerned  with  specifics  of  implementation,  and  it  was  agreed  that  the 
following  features  improve  the  list  scheduling  algorithm’s  performance: 

•  List  scheduling  without  a  universal  time  clock  is  used  (tasks  are  considered  according  to  their 
assigned  predecessors  -  not  just  completed  ones); 


•  Insertion  is  used  (search  for  idle  time  slots  is  performed); 

•  All  agents  are  considered  (not  only  “free”  agents  at  the  current  time). 

TABLE V.

Initialize: $READY = \{ i : IN(i) = \emptyset \}$

Step 1: Select task $i = \arg\max_{j \in READY} pr(j)$ $\Rightarrow$ $READY \leftarrow READY \setminus \{i\}$

Step 2: Select agent $r = \arg\min_u FT(i,u)$

Step 3: Assign task $T_i$ to agent $A_r$

Step 4: Update successors’ data (including priority info):

$\forall j \in OUT(i)$: $IN(j) \leftarrow IN(j) \setminus \{i\}$, and if $IN(j) = \emptyset$ $\Rightarrow$ $READY \leftarrow READY \cup \{j\}$

Step 5: Repeat steps 1-4 until $READY = \emptyset$.


The priority list scheduling algorithm is outlined in Table V. List scheduling algorithms differ by the method
used  to  calculate  the  priorities  of  tasks  (Step  1),  agent  selection  (Step  2),  and  information  routing 
strategy  used  in  Step  3.  We  outline  various  existing  list-scheduling  methods  in  subsection  6.1,  and 
present  our  algorithm  in  subsection  6.2. 
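
The generic loop of Table V can be sketched in a few lines. This is a deliberately simplified illustration, not the HDBL algorithm or any of the cited heuristics: it assumes that each agent processes one task at a time, it ignores link contention, memory limits, and insertion into idle slots, and all names (`list_schedule`, `comm`, `priority`) are hypothetical.

```python
import math


def list_schedule(tasks, preds, succs, p, comm, priority):
    """Priority list scheduling following Steps 1-5 of Table V.

    preds, succs     : dict task -> list of predecessor / successor tasks
    p[r][i]          : processing time of task i on agent r (math.inf if infeasible)
    comm(i, j, r, u) : delay of sending the data of edge (i, j) from agent r to u
    priority(i)      : static priority pr(i) of task i
    """
    remaining = {i: set(preds[i]) for i in tasks}
    ready = {i for i in tasks if not remaining[i]}
    agent_free = {r: 0.0 for r in p}
    assigned, finish = {}, {}
    while ready:
        i = max(ready, key=priority)                    # Step 1
        ready.remove(i)
        best = None
        for r in p:                                     # Step 2
            if p[r][i] == math.inf:
                continue
            data_ready = max(
                (finish[k] + comm(k, i, assigned[k], r) for k in preds[i]),
                default=0.0,
            )
            ft = max(data_ready, agent_free[r]) + p[r][i]
            if best is None or ft < best[0]:
                best = (ft, r)
        if best is None:
            raise ValueError(f"no agent can process task {i}")
        ft, r = best                                    # Step 3
        assigned[i], finish[i], agent_free[r] = r, ft, ft
        for j in succs[i]:                              # Step 4
            remaining[j].discard(i)
            if not remaining[j]:
                ready.add(j)
    return assigned, finish                             # Step 5: loop ends when READY is empty


# Hypothetical example: tasks 1 and 2 both feed task 3; two agents A and B.
preds = {1: [], 2: [], 3: [1, 2]}
succs = {1: [3], 2: [3], 3: []}
p = {"A": {1: 2, 2: 2, 3: 1}, "B": {1: 3, 2: 1, 3: math.inf}}
assigned, finish = list_schedule(
    [1, 2, 3], preds, succs, p,
    comm=lambda i, j, r, u: 0.0 if r == u else 1.0,
    priority={1: 3, 2: 2, 3: 1}.get,
)
```

In this example, tasks 1 and 3 are placed on agent A and task 2 on agent B, and the last task finishes at time 3. Swapping the `priority` function is what distinguishes the heuristics discussed in the following subsections.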


6.1  Related  Work 


6.1.1  Task  Priority  Selection  (Step  1) 

Mapping Heuristic (MH). In a slightly different formulation of the problem, the following algorithms
describe the basic notion of assigning tasks according to the longest length of the critical path from a
task node to the end of the graph: Modified Critical Path (MCP) [Wu88], Mapping Heuristic (MH)
[El-Rewini90a,b], [Gan96], and Heterogeneous Earliest Finish Time (HEFT) [Topcuoglu99]. The idea is to
select a task with the highest priority, defined as a static upward rank $blevel(i)$ that is equal to the length
of the longest exit path from task $T_i$ (the computation is based on the mean computation and
communication costs). For a heterogeneous system, it can be iteratively calculated as in [Topcuoglu99]:

$$blevel(i) = \bar{p}_i + \max_{j \in OUT(i)} \left[ blevel(j) + \frac{f_{ij}}{\bar{c}} \right] \qquad (10)$$

where $\bar{p}_i$ is the mean task processing time and $\bar{c}$ is the mean communication link rate.
The task is assigned to an agent that minimizes the finish time of the task.
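
The static upward rank of equation (10) can be computed with a memoized recursion over the successors of each task. The sketch below is illustrative only: the function name, the dictionary-based graph encoding, and the example values are assumptions, with `p_mean` and `c_mean` standing for the mean processing times and the mean link rate.

```python
from functools import lru_cache


def static_blevel(succs, p_mean, f, c_mean):
    """Return blevel(i) for every task, following equation (10).

    succs[i]  : list of immediate successors of task i
    p_mean[i] : mean processing time of task i
    f[(i, j)] : amount of information sent along edge (i, j)
    c_mean    : mean communication link rate
    """
    @lru_cache(maxsize=None)
    def blevel(i):
        # Exit tasks have no successors, so the maximum defaults to zero.
        tail = max((blevel(j) + f[(i, j)] / c_mean for j in succs[i]), default=0.0)
        return p_mean[i] + tail

    return {i: blevel(i) for i in succs}


# Hypothetical four-task graph: 1 -> 2 -> 4 and 1 -> 3 -> 4.
succs = {1: [2, 3], 2: [4], 3: [4], 4: []}
p_mean = {1: 2, 2: 3, 3: 1, 4: 2}
f = {(1, 2): 4, (1, 3): 2, (2, 4): 2, (3, 4): 6}
levels = static_blevel(succs, p_mean, f, c_mean=2)
```

Here the entry task obtains the largest rank (10), so a scheduler driven by this priority would consider it first.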

Dominant Sequence (DS). Algorithms such as Mobility Directed (MD) [Wu88], Dominant Sequence
Clustering (DSC) [Gerasoulis90a,b&95], and Critical-Path-on-a-Processor (CPOP) [Topcuoglu99] assign
tasks in decreasing priority, defined as the sum of the upward and downward ranks:
$pr(i) = tlevel(i) + blevel(i)$, where $tlevel(i)$ is the length of the longest path from the entry (top) node to
the task node $T_i$, not including the computation cost of $T_i$. The $tlevel(i)$ is calculated for heterogeneous
systems in a similar fashion as $blevel(i)$:

$$tlevel(j) = \max_{i \in IN(j)} \left[ tlevel(i) + \bar{p}_i + \frac{f_{ij}}{\bar{c}} \right] \qquad (11)$$

Note that tasks in the same critical path would have the same priorities. The CPOP algorithm differs from
the others by fixing the assignment of tasks on the critical path to the same agent (that completes this path
the fastest).
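
A short sketch can make the remark about equal priorities on the critical path concrete. It computes $tlevel$ as in equation (11), $blevel$ as in equation (10), and their sum as the priority. The function name, the graph encoding, and the example data are hypothetical.

```python
from functools import lru_cache


def ds_priorities(preds, succs, p_mean, f, c_mean):
    """Return pr(i) = tlevel(i) + blevel(i) for every task."""
    @lru_cache(maxsize=None)
    def blevel(i):
        # Longest exit path from task i, including its own processing time.
        tail = max((blevel(j) + f[(i, j)] / c_mean for j in succs[i]), default=0.0)
        return p_mean[i] + tail

    @lru_cache(maxsize=None)
    def tlevel(j):
        # Longest entry path to task j, excluding the processing time of j.
        return max(
            (tlevel(i) + p_mean[i] + f[(i, j)] / c_mean for i in preds[j]),
            default=0.0,
        )

    return {i: tlevel(i) + blevel(i) for i in succs}


# Hypothetical four-task graph: 1 -> 2 -> 4 and 1 -> 3 -> 4.
preds = {1: [], 2: [1], 3: [1], 4: [2, 3]}
succs = {1: [2, 3], 2: [4], 3: [4], 4: []}
p_mean = {1: 2, 2: 3, 3: 1, 4: 2}
f = {(1, 2): 4, (1, 3): 2, (2, 4): 2, (3, 4): 6}
pr = ds_priorities(preds, succs, p_mean, f, c_mean=2)
```

In this example, tasks 1, 2, and 4 form the critical path and all receive priority 10, while task 3 lies on a shorter path and receives priority 9.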

Dynamic Critical Path (DCP). DCP [Kwok94] differs from DSC and MD by restricting certain
assignments obtained from a “look-ahead” strategy and load balancing. DCP computes the priority
characteristics $tlevel$ and $blevel$ dynamically by “matching” the task graph to the agent system. These
values are computed as follows:

$$tlevel(j,m) = \max_{i \in IN(j)} \left[ p_{a(i),i} + tlevel(i, a(i)) + I(a(i), m) \cdot \frac{f_{ij}}{c_{a(i),m}} \right] \qquad (12)$$

$$blevel(i,m) = p_{m,i} + \max_{j \in OUT(i)} \left[ blevel(j, a(j)) + I(m, a(j)) \cdot \frac{f_{ij}}{c_{a(j),m}} \right] \qquad (13)$$

Here $I(r,u)$ denotes an indicator that equals 1 if $r \ne u$ and 0 otherwise, so that communication is zero
for tasks on the same agent. Then the task priority is selected as
$pr(i) = \min_m \left[ tlevel(i,m) + blevel(i,m) \right]$. The calculations and priority
updates are straightforward when the agent allocation is known, but this is not the case at the beginning
of the algorithm. In [Kwok94], the selection of $a(i)$ is not specified. Therefore, we utilize the following
recursive calculations: $tlevel(j,m) = l_t(j,m)$, $blevel(i,m) = l_b(i,m)$ (see Appendix A for details).

Level scheduling (LS). List scheduling based on levels (or layers) of the task graph was first
considered in [Adam74] for the case of identical agents. [Shirazi90] used this approach for the heavy node
first (HNF) algorithm. [Iverson95] applied this method to heterogeneous agents. The resulting Levelized-Min
Time (LMT) algorithm is a two-phase procedure. The first phase orders tasks according to their precedence
constraints layer-by-layer. The tasks in the same layer are grouped and can be executed in parallel. The
second phase is a greedy method that schedules tasks in the same layer in the decreasing order of
average computation cost. The agent that provides the fastest finish time of a task is selected.

Dynamic level scheduling (DLS). The Dynamic Level Scheduling algorithm (DLS) proposed in [Sih93] assigns node priorities by using a dynamic level (DL) of a task that is equal to

$$DL(i,k) = blevel(i) - \max[DA(i,k), TF(k)] + \delta(i,k) \qquad (14)$$

The variables in equation (14) are defined as follows:

a) $blevel(i)$ is equal to the length of the longest exit path from task $T_i$; only median execution times $\bar{p}_i$ at each task node among the processing times of this task on all agents are used in computing $blevel(i)$:

$$blevel(i) = \bar{p}_i + \max_{j \in OUT(i)} blevel(j) \qquad (15)$$

(if the median time is infinity, the largest feasible processing time is used);

b) $\delta(i,k) = \bar{p}_i - p_{k,i}$ (large positive $\delta(i,k)$ indicates that agent $A_k$ executes task $T_i$ faster than most agents);

c) $DA(i,k)$ is equal to the time that data from all predecessors of task $T_i$ arrives at agent $A_k$; and

d) $TF(k)$ is equal to the time that the last task is executed by $A_k$.


The DLS algorithm does not assign priorities to tasks using the critical path. Instead, it performs an exhaustive matching of task nodes to agents. At each scheduling step, the algorithm selects a pair $(i,k)$ of a ready task and an available agent that maximizes $DL(i,k)$. The algorithm uses non-insertion based scheduling and, therefore, can be modified to better utilize idle times in agent processing and communication network schedules as described in subsection 6.1.3.
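The dynamic level of equation (14) can be illustrated with a small Python sketch. It is only an illustration under stated assumptions: the function names (`median_time`, `blevel`, `dynamic_level`), the data layout `p[i][k]` (processing time of task `i` on agent `k`), and the toy instance are hypothetical and do not come from [Sih93]; `DA(i,k)` and `TF(k)` are passed in as plain numbers rather than computed from a schedule.

```python
import math
from statistics import median

def median_time(times):
    """Median processing time of a task over all agents.

    If the median is infinite, fall back to the largest feasible time,
    as described under item a). Assumes at least one finite entry.
    """
    m = median(times)
    if math.isinf(m):
        m = max(t for t in times if not math.isinf(t))
    return m

def blevel(i, succ, p, memo=None):
    """Static bottom level of equation (15), built from median times."""
    memo = {} if memo is None else memo
    if i not in memo:
        tail = max((blevel(j, succ, p, memo) for j in succ.get(i, ())), default=0.0)
        memo[i] = median_time(p[i]) + tail
    return memo[i]

def dynamic_level(i, k, succ, p, da, tf):
    """DL(i,k) of equation (14); da stands for DA(i,k) and tf for TF(k)."""
    delta = median_time(p[i]) - p[i][k]
    return blevel(i, succ, p) - max(da, tf) + delta

# Hypothetical toy instance: task 0 precedes task 1, two agents.
succ = {0: [1]}
p = {0: [2.0, 4.0], 1: [3.0, 1.0]}
# Exhaustive matching over (ready task, agent) pairs; only task 0 is ready.
best = max(((i, k) for i in [0] for k in [0, 1]),
           key=lambda ik: dynamic_level(ik[0], ik[1], succ, p, da=0.0, tf=0.0))
```

In this toy instance the pair `(0, 0)` wins because agent 0 runs task 0 faster than the median, which raises its dynamic level through the `delta` term.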

6.1.2  Information  Routing 

For the algorithms described in subsection 6.1.1, the information is routed on a “first come - first serve” basis. The information $f_{i,j}$ can be scheduled to a link between agents $A_r$ and $A_u$ on which $m$ messages $f_{i_1,j_1}, f_{i_2,j_2}, \ldots, f_{i_m,j_m}$ have been scheduled if there exists some $k$ such that

$$SF(i_{k+1}, j_{k+1}, r, u) - \max\{FF(i_k, j_k, r, u), FT(i)\} \ge \frac{f_{i,j}}{c_{r,u}}, \quad k = 1, \ldots, m \qquad (16)$$

where $SF(i_{m+1}, j_{m+1}, r, u) = \infty$. The start time of information $f_{i,j}$ on a link between agents $A_r$ and $A_u$ is given by

$$SF(i,j,r,u) = \max\{FF(i_s, j_s, r, u), FT(i)\} \qquad (17)$$

where $s$ is the smallest $k$ satisfying the inequality (16). Hence, $FF(i,j,r,u) = SF(i,j,r,u) + \frac{f_{i,j}}{c_{r,u}}$.
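The gap search behind inequality (16) and the start time (17) can be sketched in Python as follows. This is a minimal sketch, not the authors' implementation: the function name `first_fit_on_link` and the representation of a link as a sorted list of `(SF, FF)` pairs are assumptions, and the empty-link case (which the inequality does not cover, since it starts at k = 1) is handled by starting the message as soon as the sending task finishes.

```python
import math

def first_fit_on_link(busy, ready, length):
    """Return (SF, FF) for a new message on one link.

    busy   : (SF, FF) pairs of messages already on the link, sorted by start time
    ready  : FT(i), the finish time of the sending task
    length : f_ij / c_ru, the transfer duration on this link
    """
    for k, (_, ff_k) in enumerate(busy):
        # Start of the next scheduled message; infinity after the last one.
        next_sf = busy[k + 1][0] if k + 1 < len(busy) else math.inf
        start = max(ff_k, ready)
        if next_sf - start >= length:        # inequality (16)
            return start, start + length     # equation (17), then FF = SF + f/c
    # Assumption: an empty link is free from the moment the sender finishes.
    return ready, ready + length

# Hypothetical link carrying two messages, occupying [0, 2) and [5, 6).
link = [(0.0, 2.0), (5.0, 6.0)]
short = first_fit_on_link(link, ready=1.0, length=2.0)   # fits in the gap [2, 5)
long_ = first_fit_on_link(link, ready=1.0, length=4.0)   # must go after the last message
```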


6.1.3  Agent  Selection  ( Step  2) 

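For algorithms described in subsection 6.1.1 (except for DLS), an agent is selected to process a task to minimize the finish time of this task. The start time $ST(j,u)$ of scheduling task $T_j$ to agent $A_u$ to minimize the finish time of $T_j$ can be found via

$$ST(j,u) = \min_{t \ge \Delta} \{ t \mid \forall x \in [t, t + p_{u,j}) : W_u \ge W(u,x) + w_j \} \qquad (18)$$

where $\Delta = \max_{i \in IN(j)} FF(i,j,a(i),u)$; that is, $ST(j,u)$ is the earliest time at or after the arrival of all input information at which agent $A_u$ has enough spare workload capacity for the whole duration of $T_j$. The task is then assigned to an agent $A_k$ so that its finish time is minimized: $k = \arg\min_u FT(j,u)$. Hence, $ST(j) = ST(j,k)$.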
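The search for the start time in equation (18) and the subsequent agent choice can be sketched as below. The sketch assumes that each agent's workload profile is stored as a list of half-open intervals `(start, finish, load)`; this representation and the names `load_at`, `earliest_start`, and `select_agent` are illustrative assumptions, not part of the paper.

```python
import math

def load_at(intervals, x):
    """Workload W(u, x) carried by one agent at time x."""
    return sum(w for s, f, w in intervals if s <= x < f)

def earliest_start(intervals, capacity, delta, duration, workload):
    """Smallest t >= delta with capacity >= W(u, x) + workload on [t, t + duration)."""
    # The load only drops at finish times, so the earliest feasible t
    # is either delta itself or the finish time of a scheduled task.
    candidates = sorted({delta} | {f for _, f, _ in intervals if f > delta})
    for t in candidates:
        # The load is piecewise constant and rises only at start times,
        # so checking t and the starts inside the window is sufficient.
        points = [t] + [s for s, _, _ in intervals if t < s < t + duration]
        if all(load_at(intervals, x) + workload <= capacity for x in points):
            return t
    return None  # the task never fits (its workload exceeds the capacity)

def select_agent(agents, delta, p, workload):
    """agents[u] = (intervals, capacity); returns (u, ST, FT) with the smallest FT."""
    best = None
    for u, (intervals, capacity) in agents.items():
        if math.isinf(p[u]):
            continue  # agent u cannot process the task
        st = earliest_start(intervals, capacity, delta[u], p[u], workload)
        if st is not None and (best is None or st + p[u] < best[2]):
            best = (u, st, st + p[u])
    return best

# Hypothetical data: A1 is busy on [0, 3); A2 is idle but slower and receives data later.
agents = {"A1": ([(0.0, 3.0, 1.0)], 1.0), "A2": ([], 1.0)}
choice = select_agent(agents, delta={"A1": 1.0, "A2": 2.0},
                      p={"A1": 1.0, "A2": 3.0}, workload=1.0)
```

Here the task waits for A1 to become free at time 3 and finishes at 4, which is still earlier than the finish time 5 on the idle agent A2.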


6.1.4  Updates 

After the assignment has been determined, scheduling and mapping variables are updated. First, information transfer statistics and the corresponding memory loads (commensurate with the new assignment) are updated:

For $\forall r$ and $\forall i$ such that $x_{i,r} = 1$, $e^t_{i,j} \in E_t$, $f_{i,j} \ne 0$ we have:

$$SF(i,j) = SF(i,j,r,k), \quad FF(i,j) = FF(i,j,r,k) \qquad (19)$$

$$M(r,t) \leftarrow M(r,t) + m_i \quad \text{for } \forall t \in [FT(i), SF(i,j)) \qquad (20)$$

$$M(k,t) \leftarrow M(k,t) + m_i \ \text{ for } \forall t \in [FF(i,j), ST(j)), \quad \text{and} \quad E(r,k) \leftarrow E(r,k) \cup \{f_{i,j}\} \qquad (21)$$

Second, assignment variables and workload data corresponding to the new assignment are updated:

$$x_{j,k} = 1, \quad FT(j) = ST(j) + p_{k,j} \qquad (22)$$

$$W(k,t) \leftarrow W(k,t) + w_j \quad \text{for } \forall t \in [ST(j), FT(j)) \qquad (23)$$


6.2  Heterogeneous  Dynamic  Bottom  Level  (HDBL)  Algorithm 
6.2.1  Agent-Task  Selection 

Algorithms  that  assign  tasks  using  a  static  upward  rank  either  completely  neglect  the  mapping  of  task 
flow  graph  onto  a  heterogeneous  agent  network  (MH,  DS)  by  using  the  average  of  processing  time  and 
link  rate,  or  use  the  “best”  mapping  for  priority  calculation  while  disregarding  the  load  balancing  issue 
(DCP).  Calculation  of  priorities  in  DLS  fails  to  capture  the  real  relation  of  upward  rank  to  the  start  time 
of  the  task;  for  instance,  calculation  of  the  static  upward  rank  using  median  values  without  considering 
communication  and  network  topology  degrades  the  performance. 

In this paper, we propose the HDBL algorithm, in which task mapping to every feasible agent is considered. As a result, steps 1 and 2 of the list scheduling heuristic are combined to find a task-agent pair. For each ready task $T_i$, an agent $A_{r(i)}$ is selected to minimize a value identifying the schedule delay introduced by assigning it to this agent. A straightforward approach is to select $r(i) = \arg\min_k [ST(i,k) + l_b(i,k)]$. Then a task is selected for scheduling that maximizes the mapped critical path value: $i^* = \arg\max_{i \in READY,\ l_b(i,r(i)) \ne \infty} l_b(i,r(i))$, and $T_{i^*}$ is allocated to agent $A_{r(i^*)}$ (hence, $a(i^*) = r(i^*)$). In our algorithm, we use the following rules that utilize the network topology in a more efficient manner:

•  Agent-task selection: $r(i) = \arg\min_k [ST(i,k) + blevel(i,k)]$ (24)

•  Task selection: $i^* = \arg\min_{i \in READY,\ blevel(i,r(i)) \ne \infty} [ST(i,r(i)) - blevel(i,r(i))]$ (25)

Here,

$$blevel(i,k) = p_{k,i} + \max_{j \in OUT(i)} \operatorname{mean}_m \left( blevel(j,m) + I_{m,k} \cdot \frac{f_{i,j}}{c_{k,m}} \right) \qquad (26)$$

where $\operatorname{mean}_m(F[m])$ is equal to the mean of $\{F[m] : F[m] \ne \infty\}$, and $I_{m,k} = 0$ if $m = k$ and $I_{m,k} = 1$ otherwise.

The earliest start time $ST(i,k)$ of task $T_i$ on agent $A_k$ is computed dynamically using the message routing strategy for single task allocation (as described in subsection 6.2.2) to obtain the time of arrival of all information to agent $A_k$, equal to $\max_{j \in IN(i)} FF(j,i)$, and the available time slot in the agent’s processing (equation (18)).
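The bottom level of equation (26) lends itself to a memoized recursion. The Python sketch below is illustrative only: the factory `make_blevel`, the layout `p[k][i]` (agent `k`, task `i`), `f[(i, j)]` for message sizes, and `c[k][m]` for link rates are assumptions, as is the treatment of a missing link (`c[k][m] == 0`) as an infinite delay. Zero-size messages are taken to cost nothing on any link, following the note in Appendix A.

```python
import math
from functools import lru_cache

def make_blevel(p, succ, f, c):
    """Build blevel(i, k) as in equation (26) for a task graph and agent network."""
    n_agents = len(p)

    def comm(i, j, k, m):
        # Communication delay of message f_ij from agent k to agent m.
        size = f.get((i, j), 0.0)
        if m == k or size == 0:
            return 0.0
        return size / c[k][m] if c[k][m] > 0 else math.inf

    @lru_cache(maxsize=None)
    def blevel(i, k):
        if math.isinf(p[k][i]):
            return math.inf  # agent k cannot process task i
        worst = 0.0
        for j in succ.get(i, ()):
            vals = [blevel(j, m) + comm(i, j, k, m) for m in range(n_agents)]
            finite = [v for v in vals if not math.isinf(v)]
            # Mean over the feasible placements of successor j only.
            mean = sum(finite) / len(finite) if finite else math.inf
            worst = max(worst, mean)
        return p[k][i] + worst

    return blevel

# Hypothetical instance: task 0 -> task 1, two agents joined by a link of rate 2.
p = [[2.0, 1.0], [3.0, 3.0]]
blevel = make_blevel(p, succ={0: [1]}, f={(0, 1): 4.0}, c=[[0.0, 2.0], [2.0, 0.0]])
values = {(i, k): blevel(i, k) for i in (0, 1) for k in (0, 1)}
```

Averaging over the successor's feasible agents, rather than taking the best one, is what distinguishes this quantity from the mapped critical path value of Appendix A.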


6.2.2  Information  Routing 

When a task $T_j$ is to be scheduled on agent $A_u$, all information from its predecessors (that is, tasks from the set $IN(j)$) should be communicated to agent $A_u$. The set of communication messages from predecessors can be decomposed according to the agents to which the corresponding tasks are allocated. Each such set, denoted as $F_r = \{f_{i,j} \mid i \in IN(j),\ a(i) = r,\ f_{i,j} \ne 0\}$, must be mapped to a single communication link $e^a_{r,u} = \langle A_r, A_u \rangle$ (mapping is feasible iff $c_{r,u} \ne 0$). Instead of scheduling messages from $F_r$ one-by-one (as is done in existing algorithms, subsection 6.1), we utilize a better message routing strategy (see Appendix B for algorithm details). The idea is to use the subset-sum problem and the structure of the link idle times to improve the link “packing” and, as a result, minimize the arrival time of all information at agent $A_u$.


6.3  Example:  Agent  Selection  in  HDBL 

A  selection  of  the  agent  that  can  complete  a  chosen  task  at  the  earliest  time  (subsection  6.2)  disregards 
the  network  contention  issues  and  can  result  in  unnecessarily  long  schedules.  The  example  of  Figs.  1,2 
and  Table  III  shows  the  improvement  that  can  be  obtained  by  considering  the  effect  of  agent  allocation 
on  the  critical  path  length  (and,  as  a  result,  the  total  execution  time  of  the  task  graph). 

Table  VI  shows  the  values  of  blevel(i,k) .  Consider  two  iterations  of  the  algorithm  shown  in  Fig.  6. 

TABLE VI: Bottom Level Values blevel(i,k)

k \ i    T1      T2      T3      T4      T5      T6
A1       5.5     16.83   17.83   1       -       3
A2       -       -       -       1       -       1
A3       4.75    7.33    -       -       -       0.5
A4       -       -       -       3       3.33    -
A5       9       -       8.33    -       7.33    2

Figure 6: Scheduling Iterations in HDBL


Stage (a): ready tasks $READY = [1,5]$; the earliest start times $ST(i,k)$ and finish times $FT(i,k)$ on agents, the resulting “agent efficiency” $ST(i,k) + blevel(i,k)$, and the “delay” coefficients $ST(i,r(i)) - blevel(i,r(i))$ are listed in Table VII. As a result, task $T_1$ and agent $A_1$ are selected. Note that the choice of agent $A_5$ would minimize the finish time of task $T_1$, but this would increase the total schedule length of the task graph: the earliest finish times for successor task $T_4$ would be $[\infty,\infty,-,7,-,-]$.

Stage (b): ready tasks $READY = [4,5]$; the earliest start and finish times, the resulting “agent efficiency”, and the “delay” coefficients are shown in Table VIII. As a result, task $T_4$ and agent $A_4$ are selected. This agent mapping also minimizes the finish time of the selected task. The length of the schedule obtained by HDBL is 6.5. All other algorithms result in a schedule with total length equal to 10 (the outputs of HDBL and MH are shown in Fig. 7).
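The selection at stage (a) can be reproduced numerically. The Python sketch below applies rules (24) and (25) to the bottom level values of Table VI and the earliest start times of Table VII for the ready tasks $T_1$ and $T_5$, as read from those tables. The dictionary layout and the helper names `pick_agent` and `pick_task` are illustrative assumptions; table entries marked "-" are encoded as infinity, which is also an assumption about their meaning.

```python
import math

INF = math.inf
# blevel(i, k) for tasks T1 and T5 on agents A1..A5 (Table VI).
blevel = {1: [5.5, INF, 4.75, INF, 9.0], 5: [INF, INF, INF, 3.33, 7.33]}
# Earliest start times ST(i, k) at stage (a) (Table VII).
start = {1: [0.0, INF, 1.0, INF, 0.0], 5: [INF, INF, INF, 5.0, 2.0]}

def pick_agent(i):
    """Rule (24): agent index minimizing ST(i,k) + blevel(i,k)."""
    scores = [start[i][k] + blevel[i][k] for k in range(5)]
    return min(range(5), key=scores.__getitem__)

def pick_task(ready):
    """Rule (25): ready task minimizing ST(i,r(i)) - blevel(i,r(i))."""
    r = {i: pick_agent(i) for i in ready}
    feasible = [i for i in ready if not math.isinf(blevel[i][r[i]])]
    i_star = min(feasible, key=lambda i: start[i][r[i]] - blevel[i][r[i]])
    return i_star, r[i_star]

task, agent = pick_task([1, 5])   # agent indices are zero-based: 0 stands for A1
```

With these inputs the agent efficiencies of $T_1$ are 5.5 on $A_1$, 5.75 on $A_3$ and 9 on $A_5$, so $A_1$ is preferred even though $A_5$ offers the earliest finish time; the delay coefficient of $T_1$ is then lower than that of $T_5$, so $T_1$ is scheduled first.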


TABLE VII: Mapping Coefficients (a)

Earliest Start Times:

         T1      T5
A1       0       -
A2       -       -
A3       1       -
A4       -       5
A5       0       2

Earliest Finish Times:

         T1      T5
A1       3       -
A2       -       -
A3       2.5     -
A4       -       5.5
A5       1       4

Agent Selection: Agent Efficiency

Task Selection: Delay Coefficients

TABLE VIII: Mapping Coefficients (b)

Earliest Start Times:

         T4      T5
A1       ∞       -
A2       7       -
A3       -       -
A4       3.5     5
A5       -       2

Earliest Finish Times:

         T4      T5
A1       ∞       -
A2       8       -
A3       -       -
A4       6.5     5.5
A5       -       4

Agent Selection: Agent Efficiency

         T4      T5
A1       ∞       -
A2
A3       -       -
A4
A5

Figure 7: Task Schedule Gantt-Chart


7  Simulation  Results 

To compare the performance of scheduling algorithms, we utilize the Schedule Length Ratio (SLR) measure for heterogeneous agent systems:

$$SLR = \frac{\max_i FT(i,a(i))}{\sum_{i \in CP} \min_k p_{k,i}}, \qquad (27)$$

where $CP$ is the critical path in the modified graph without communication constraints with task node computation cost equal to its minimum processing time among agents. SLR is equal to the schedule length (main performance measure of a scheduling algorithm) normalized by its lower bound (equal to the summation of computation costs of nodes on the critical path).
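As a concrete reading of the SLR definition, the following Python sketch computes the lower bound as the longest path in the task graph when every task is charged its minimum processing time and communication is ignored, and then divides the schedule length by it. The function name `slr` and the input layout (`finish[i]` for $FT(i,a(i))$, `p[i]` for the list of processing times of task `i` over agents) are assumptions made for illustration.

```python
def slr(finish, succ, p):
    """Schedule Length Ratio of equation (27).

    finish : {task: finish time in the evaluated schedule}
    succ   : {task: list of successor tasks}
    p      : {task: list of processing times over all agents}
    """
    memo = {}

    def longest(i):
        # Longest exit path from i using minimum processing times, no communication.
        if i not in memo:
            tail = max((longest(j) for j in succ.get(i, ())), default=0.0)
            memo[i] = min(p[i]) + tail
        return memo[i]

    lower_bound = max(longest(i) for i in p)
    return max(finish.values()) / lower_bound

# Hypothetical three-task fork: task 0 precedes tasks 1 and 2.
ratio = slr(finish={0: 2.0, 1: 5.0, 2: 6.0},
            succ={0: [1, 2]},
            p={0: [2.0, 3.0], 1: [4.0, 1.0], 2: [2.0, 5.0]})
```

In this instance the critical path is 0 -> 2 with length 2 + 2 = 4, so a schedule of length 6 gives a ratio of 1.5.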


7.1  Communication  Versus  Computation  Cost 

We have explored the effect of the ratio between communication cost (and the corresponding delays) and task computation cost. We consider a simple example of a task graph with 50 task nodes and 6 agent nodes:

•  Task  processing  times  are  identical  for  all  agents  (equal  to  1  unit  of  time). 

•  Task  workload  (=1)  and  task  memory  load  (=0)  are  identical. 

•  Agent  workload  capacity  is  the  same  (=1,  implying  that  each  agent  can  do  only  a  single  task  at  a 
time),  task  knowledge  rate  is  uniform  in  [0.3,1]. 

•  Agent  architecture  is  fully  connected  with  link  rates  uniformly  distributed  in  [1,10]  interval. 

•  Task  communication  cost  is  varied  from  0  units  (no  communication  between  tasks)  to  300  units 
(average  communication  delay  =  60;  communication-intensive  graph). 

•  The  density  of  the  graph  is  varied  by  the  ratio  of  tasks  per  layer,  which  is  uniform  in  [0.0,0.2], 
and  the  number  of  predecessors,  which  is  uniform  in  [0,10].  As  a  result,  the  number  of  layers  is 
between  10  and  50. 


Figure  8:  Communication  Increase  Figure  9:  Computation  Increase 

SLR increases with communication cost (the denominator does not include communication in its calculation). We found that the performance of the other algorithms (MH, DS, LS, DLS, DCP) is approximately the same (with DLS consistently providing shorter schedules), while the HDBL algorithm provides an improvement over these algorithms of up to 28% for communication-intensive graphs. Fig. 8 shows the average SLR over 200 Monte-Carlo runs for the HDBL algorithm and for the best solution of the other methods (MH, DS, DCP, LS, DLS). Fig. 9 provides the results for an alternative comparison: fixing the communication cost at 50 units and varying the task processing time from 10 to 80 units. HDBL still exhibits superior performance to all other algorithms, although its improvement decreases as the obtained schedule approaches the lower bound (and hence the optimal solution).


7.2  Varied  Communication 

In subsection 7.1, we showed that our algorithm achieves more than 25% improvement for communication-intensive graphs with equal communication cost for all messages. In Fig. 10, we show results for graphs with varied communication cost. The minimum communication cost is set at 50 units (average delay = 10), and the maximum varies from 50 to 300 units (average delay = 60). We can see that HDBL provides an improvement of 20% to 25% over the other algorithms.



Figure  10:  Varied  Communication 


8  Conclusions  and  Future  Research 

In this paper, we have considered the problem of mapping information task graphs onto an organization of heterogeneous agents under agent processing, network topology, workload and memory capacity constraints. We proposed the HDBL algorithm, which outperforms existing list scheduling heuristics by providing an improvement of over 25% for communication-intensive task graphs. The HDBL algorithm is an off-line procedure that is based on matching the critical path in the task graph to the network of heterogeneous agents, providing a better priority evaluation for agent-to-task mapping and scheduling. A novel approach for information routing further improves the performance of the algorithm by efficiently distributing information transfers on communication links in an agent network.

Our current research is focused on exploring the task graph structure as a means to improve its mapping onto the agent architecture. We consider linear clustering and other graph decomposition methods as a means to extract elementary subgraphs for temporal scheduling onto an agent network. Note that the heterogeneous nature of agents in general prevents mapping subgraphs, such as linear clusters (chains of communicating tasks), to a single agent. Even when this is possible, in many situations mapping tasks to multiple agents can reduce the schedule length. In such cases, a trade-off between the optimality of local mapping and communication link contention should be considered.

The  information  routing  in  an  agent  network  is  one  of  the  most  important  factors  in  minimizing 
communication  delays.  In  our  current  model,  only  directly  connected  agents  are  allowed  to 
communicate.  This  can  be  generalized  to  allow  communication  between  any  indirectly  connected  nodes 
by  introducing  the  problem  of  finding  paths  for  the  transfer  of  information  between  these  agents  so  as  to 
minimize  the  aggregated  communication  cost  (or  delay)  that  accounts  for  link  contention.  Note  that  a 
direct  link  between  two  nodes  is  not  necessarily  a  shortest  path  between  them.  We  plan  to  address  these 
issues  in  our  future  research  efforts. 


APPENDIX  A:  Critical  Path  in  Heterogeneous  Networks 

In  the  following,  we  describe  a  procedure  for  finding  the  critical  path  in  a  task  graph  considering  the 
topology  of  the  agent  network.  Due  to  the  heterogeneous  nature  of  agent’s  characteristics  and  network, 
the  generic  critical  path  does  not  describe  the  real  processes  underlying  the  notion  of  a  critical  path  as 
the  longest  processing  sequence  in  the  graph. 

To  find  the  critical  paths  (longest  chains)  in  a  task  graph,  we  need  to  traverse  the  graph  backwards 
storing  the  labels  at  the  task  nodes  associated  with  the  longest  assignment  among  all  exit  paths  leading 
from  this  node  allocated  to  any  feasible  agent.  The  length  of  an  assignment  of  a  particular  exit  path  is 
equal  to  the  shortest  feasible  schedule  of  the  corresponding  chain  graph  on  the  agent  network. 

We define $S$ as the set of tasks that have all successors labeled. The critical path algorithm is shown in Table IX. We can see that the length of a critical path can be recursively computed:

$$l_b(i,k) = p_{k,i} + \max_{j \in OUT(i)} \min_m \left( l_b(j,m) + I_{m,k} \cdot \frac{f_{i,j}}{c_{k,m}} \right) \qquad (28)$$

Note that if $f_{i,j} = 0$, then $\forall k,m : \frac{f_{i,j}}{c_{k,m}} = 0$.

If task-to-agent assignment is known, then

$$l_b(i,k) = p_{k,i} + \max_{j \in OUT(i)} \left[ l_b(j,a(j)) + I_{a(j),k} \cdot \frac{f_{i,j}}{c_{k,a(j)}} \right] \qquad (29)$$

Analogously, $l_t(j,m)$ can be computed recursively by traversing the task graph forward:

$$l_t(j,m) = \begin{cases} 0, & \text{if } IN(j) = \emptyset \text{ and } p_{m,j} < \infty \\ \infty, & \text{otherwise} \end{cases} \qquad (30)$$

$$l_t(j,m) = \max_{i \in IN(j)} \min_k \left[ p_{k,i} + l_t(i,k) + I_{m,k} \cdot \frac{f_{i,j}}{c_{k,m}} \right] \qquad (31)$$

If the assignment of tasks is known, then

$$l_t(j,m) = \max_{i \in IN(j)} \left[ p_{a(i),i} + l_t(i,a(i)) + I_{m,a(i)} \cdot \frac{f_{i,j}}{c_{a(i),m}} \right] \qquad (32)$$

More precisely, we should write: $l_t(j,m) = \max_{i \in IN(j)} FF(i,j,a(i),m)$.

$l_t(j,m)$ refers to the earliest possible time that a task $T_j$ can start execution on agent $A_m$. The latest time that a task $T_i$ can be started on agent $A_k$ so that the start times of tasks in the critical path are not delayed is equal to $\max_j l_t(j) - l_b(i,k)$.

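TABLE IX.

Initialize: $CP(i,k) = [T_i]$ if $OUT(i) = \emptyset$, and $CP(i,k) = \emptyset$ otherwise; $S = \{i : OUT(i) = \emptyset\}$;
            $CA(i,k) = [k]$; $\forall j \in S$: $l_b(j,m) = p_{m,j}$

while $S \ne \emptyset$ do
    $S^* = S$, $S = \emptyset$
    for each $j \in S^*$
        for each $i \in IN(j)$
            for each agent $k$
                if $f_{i,j} = 0$, then
                    $m^* = \arg\min_m [l_b(j,m)]$, $\Delta = l_b(j,m^*)$
                else
                    if $l_b(j,k) < \min_{m: m \ne k} [l_b(j,m) + f_{i,j}/c_{k,m}]$, then $m^* = k$, $\Delta = l_b(j,k)$
                    otherwise $m^* = \arg\min_{m: m \ne k} [l_b(j,m) + f_{i,j}/c_{k,m}]$, $\Delta = l_b(j,m^*) + f_{i,j}/c_{k,m^*}$
                end if
                if $l_b(i,k) < p_{k,i} + \Delta$, then
                    $l_b(i,k) = p_{k,i} + \Delta$
                    $CP(i,k) = [i, CP(j,m^*)]$
                    $CA(i,k) = [k, CA(j,m^*)]$
                end if
            end for
            $OUT(i) = OUT(i) \setminus \{j\}$
            if $OUT(i) = \emptyset$, then $S = S \cup \{i\}$ end if
        end for
    end for
end while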
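The backward labeling of Table IX can be sketched in Python by visiting tasks in reverse topological order, so that every successor is labeled before its predecessors. This is a simplified sketch rather than the authors' procedure: the function name `mapped_bottom_levels`, the inputs `p[i][k]`, `f[(i, j)]`, and `c[k][m]`, and the use of `graphlib` are assumptions, and only the next step of the mapped critical path is stored instead of the full lists $CP$ and $CA$. A link rate of zero is treated as a missing link with infinite delay.

```python
import math
from graphlib import TopologicalSorter

def mapped_bottom_levels(p, succ, f, c):
    """Return lb[i, k] (mapped critical path length) and nxt[i, k] = (j, m*)."""
    n_agents = len(c)

    def delay(i, j, k, m):
        size = f.get((i, j), 0.0)
        if m == k or size == 0:
            return 0.0   # same agent, or nothing to transfer
        return size / c[k][m] if c[k][m] > 0 else math.inf

    # Feeding successors as "dependencies" makes static_order() emit each
    # task only after all of its successors, i.e. a backward traversal.
    order = TopologicalSorter({i: succ.get(i, ()) for i in p}).static_order()
    lb, nxt = {}, {}
    for i in order:
        for k in range(n_agents):
            tail, step = 0.0, None
            for j in succ.get(i, ()):
                # Inner minimization over the agent m of successor j.
                val, m_star = min((lb[j, m] + delay(i, j, k, m), m)
                                  for m in range(n_agents))
                # Outer maximization over successors.
                if step is None or val > tail:
                    tail, step = val, (j, m_star)
            lb[i, k] = p[i][k] + tail
            nxt[i, k] = step
    return lb, nxt

# Hypothetical instance: task 0 -> task 1, two agents joined by a link of rate 2.
lb, nxt = mapped_bottom_levels(p={0: [2.0, 3.0], 1: [1.0, 3.0]},
                               succ={0: [1]}, f={(0, 1): 4.0},
                               c=[[0.0, 2.0], [2.0, 0.0]])
```

Following `nxt` from a pair `(i, k)` recovers the chain of tasks and agents that realizes the mapped critical path length `lb[i, k]`.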


APPENDIX  B:  Scheduling  Multiple  Information  Messages  on  a  Single  Communication  Link 

For each agent $A_r$ ($r \ne u$), we identify the set $S_r$, which consists of predecessors of task $T_j$ that are assigned to $A_r$: $S_r = \{i \mid i \in IN(j),\ a(i) = r\}$. Then, a set of information messages from these tasks $F_r = \{f_{i,j} \mid i \in S_r,\ f_{i,j} \ne 0\}$ must be communicated to agent $A_u$. If $F_r \ne \emptyset$, messages from set $F_r$ must be scheduled to the link $e^a_{r,u} = \langle A_r, A_u \rangle$. Note that this scheduling is feasible iff $c_{r,u} \ne 0$.

Let us assume that the message set $E(r,u) = \{f_{i_1,j_1}, f_{i_2,j_2}, \ldots, f_{i_m,j_m}\}$ has been scheduled to communication link $e^a_{r,u} = \langle A_r, A_u \rangle$. We define

$$t_k = FF(i_k, j_k), \quad \Delta_k = SF(i_{k+1}, j_{k+1}) - FF(i_k, j_k), \quad \text{for } k = 1, \ldots, m-1, \qquad (33)$$

where $t_0 = 0$, $\Delta_0 = SF(i_1, j_1)$. If $\Delta_k \ne 0$, then the interval $[t_k, t_k + \Delta_k]$ (called idle interval) can be used to allocate a set $F_{r,k}$ of messages that can be executed during $[t_k, t_k + \Delta_k]$. That is, $F_{r,k} = \{f_{i,j} : i \in S_{r,k},\ f_{i,j} \ne 0\}$, where

$$S_{r,k} = \left\{ i : i \in S_r,\ \max\{t_k, FT(i)\} + \frac{f_{i,j}}{c_{r,u}} \le t_k + \Delta_k \right\} \qquad (34)$$

Suppose we wish to schedule a set $F$ of messages with lengths $w_n$ and release times $r_n$ during the time interval $[a,b]$. Since the objective of message scheduling is to reduce the time of arrival of the last message to agent $A_u$, the problem of scheduling to a single idle time interval $[a,b]$ is equivalent to finding an allocation with maximum total scheduling time, that is $\max \sum_{n \in F} x_n w_n$, where $x_n = 1$ iff a message $n$ is assigned to interval $[a,b]$ ($= 0$ otherwise). The scheduling must take into account the time constraints of the problem (message release times $r_n$ for each message, non-overlapping schedule of tasks, and interval length). When $\forall n \in F : r_n \le a$, the problem becomes a subset-sum problem for a knapsack with capacity $C = b - a$:

$$\max \sum_{n \in F} x_n w_n \quad \text{subject to} \quad \sum_{n \in F} x_n w_n \le C, \quad x_n \in \{0,1\} \qquad (35)$$

Although  this  problem  is  NP-hard,  it  can  be  solved  efficiently.  For  example,  fully  polynomial 
approximation  schemes  have  been  developed  [Martello90]. 
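For small instances, the subset-sum problem (35) can also be solved exactly by dynamic programming over reachable totals. The Python sketch below shows one such approach; it is not the approximation scheme cited from [Martello90], and the function name `subset_sum` is a hypothetical choice. The number of reachable totals stays bounded by the capacity only when the message lengths are integers (or are scaled to integers), which is an assumption of this sketch.

```python
def subset_sum(weights, capacity):
    """Pick items whose total weight is as large as possible without exceeding capacity.

    Returns (best_total, chosen_indices).
    """
    # reach maps each attainable total to one list of item indices producing it.
    reach = {0: []}
    for n, w in enumerate(weights):
        # Iterate over a snapshot so that item n is used at most once.
        for total, items in list(reach.items()):
            new_total = total + w
            if new_total <= capacity and new_total not in reach:
                reach[new_total] = items + [n]
    best = max(reach)
    return best, reach[best]

# Hypothetical idle interval of length 8 and three messages of lengths 3, 5 and 4.
total, chosen = subset_sum([3, 5, 4], capacity=8)
```

In this example the messages of lengths 3 and 5 fill the interval completely, leaving no idle time on the link.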

When $\exists n \in F : r_n > a$, we can solve the problem by applying a sequence of subset-sum problems working “backwards” in the idle interval $[a,b]$. The corresponding algorithm, called Bounded Interval Message Mapping (BIMM), is described in Table X. The BIMM algorithm uses the consecutive application of the Subset-Sum problem for time intervals determined from the release times of messages.

For our problem, when scheduling set $F = F_{r,k}$ in an interval $[a,b] = [t_k, t_k + \Delta_k]$, we have:

$$r_n = ST(i_n), \quad w_n = \frac{f_{i_n,j}}{c_{r,u}}, \quad \text{and} \quad C = \Delta_k. \qquad (36)$$

We propose the algorithm (Table XI) that schedules information messages to a communication link utilizing the available idle time intervals in link assignment. The algorithm searches for time intervals and assigns messages to them according to BIMM.


TABLE X.

Input: Message set $F$, message release times $r_i$, message lengths $w_i$, time interval $[a,b]$

Output: Assigned messages $F^*$

Initialize: $F' = \emptyset$, $F^* = \emptyset$, $t_2 = b$

Step 1: Find $t_1$ = maximal start time of messages from $F$.

Step 2: Select messages from $F$ with start time equal to $t_1$, remove them from $F$ and add to $F'$.

Step 3: Find set $F''$ of messages in $F'$ that can be executed during time interval $[t_1, t_2]$. That is: $F'' = \{i : i \in F',\ \max[t_1, r_i] + w_i \le t_2\}$

Step 4: Solve Subset-Sum problem for items from $F''$ and knapsack with capacity $C = t_2 - t_1$. Obtain set of assigned items $S$ and total assigned time $W = \sum_{i \in S} w_i$

Step 5: Remove items of $S$ from $F'$ and add them to $F^*$. Update $t_2 = t_1 + C - W$.

Step 6: Repeat steps 1-5 until $F = \emptyset$


TABLE XI.

Input: Set of messages $F_r$ to be assigned to a link with messages $E(r,u) = \{f_{i_1,j_1}, \ldots, f_{i_m,j_m}\}$

Initialize: $FT(0) = 0$, $F = \emptyset$, $r_i = ST(i)$, $w_i = f_{i,j}/c_{r,u}$

for $k = 0, \ldots, m-1$ do
    $a = FT(i_k) + w_{i_k}$, $b = FT(i_{k+1})$ and define a set of messages $F_{r,k}$
    if $a < b$, then $F = F \cup F_{r,k}$ and solve BIMM, obtaining set $S^*$ of assigned messages.
    Update: $F = F \setminus S^*$
end for

If $F \ne \emptyset$, then assign remaining items in the increasing order of message release times:
$a = FT(i_m) + w_{i_m}$
while $F \ne \emptyset$ do
    Select a message $i$ from $F$ with smallest release time $ST(i)$.
    Update $SF(i,j) = \max(a, FT(i))$, $FF(i,j) = SF(i,j) + w_i$, $F = F \setminus \{i\}$, $a = FF(i,j)$
end while